9 research outputs found

    Molecular Formula Identification using High Resolution Mass Spectrometry: Algorithms and Applications in Metabolomics and Proteomics

    We investigate several theoretical and practical aspects of identifying the molecular formula of biomolecules using high-resolution mass spectrometry. Owing to recent advances in instrumentation, mass spectrometry (MS) has become one of the key technologies for the analysis of biomolecules in proteomics and metabolomics. It measures the masses of the molecules in a sample with high accuracy and is well suited to high-throughput data acquisition. One of the core tasks in MS-based proteomics and metabolomics is the identification of the molecules in the sample. In metabolomics, metabolites are subject to structure elucidation, beginning with the molecular formula of a molecule, i.e., the number of atoms of each element. This is the decisive step in identifying an unknown metabolite, since the determined formula reduces the number of possible molecular structures to a much smaller set that can be analyzed further with automated structure elucidation methods. After preprocessing, the output of a mass spectrometer is a list of peaks corresponding to the molecular masses and their intensities, i.e., the number of molecules with a given mass. In principle, the molecular formulas of small molecules can be identified from accurate masses alone. However, it has been found that, because of the large number of chemically legitimate formulas in the upper mass range, excellent mass accuracy alone is not sufficient for identification. High-resolution MS allows molecular masses and intensities to be determined with outstanding accuracy. In this thesis, we develop several algorithms and applications that use this information to identify the molecular formulas of biomolecules.

    SIRIUS: decomposing isotope patterns for metabolite identification

    Motivation: High-resolution mass spectrometry (MS) is among the most widely used technologies in metabolomics. Metabolites participate in almost all cellular processes, but most metabolites still remain uncharacterized. Determination of the sum formula is a crucial step in the identification of an unknown metabolite, as it reduces its possible structures to a hopefully manageable set

    Identifying metabolites with integer decomposition techniques, using only their mass spectrometric isotope patterns

    Böcker S, Letzel M, Lipták Z, Pervukhin A. Identifying metabolites with integer decomposition techniques, using only their mass spectrometric isotope patterns. Forschungsberichte der Technischen Fakultät, Abteilung Informationstechnik / Universität Bielefeld. Bielefeld: Technical Faculty, Bielefeld University; 2007.
    Metabolites, small molecules that are intermediates and products of the metabolism, participate in almost all cellular processes such as signal transduction and stress response. There exist several thousand metabolites for every species, the overwhelming majority still being uncharacterized. Mass spectrometry has become a method of choice to analyze the metabolites of a cell. High resolution mass spectrometry allows us to determine the mass and isotopic distribution of sample molecules with outstanding accuracy. Here, we provide a method to determine the sum formula of an unidentified metabolite (or, more generally, any chemical compound) solely from its mass and isotopic pattern. This is a crucial step in the identification of an unknown metabolite, as it reduces its possible structures to a finite and, hopefully, manageable set. In Part I, we show how to use integer decomposition techniques, introduced earlier by two of the authors, for decomposing real valued molecule masses, with large improvements over naïve methods that are currently best known for this problem. We then show how to rapidly match and rank simulated spectra against the measured spectrum. Our method is computationally efficient and can be applied to metabolites and other chemical compounds with mass up to 1000 Dalton. First results on experimental data indicate good identification rates for chemical compounds up to 700 Dalton. In Part II, we present our method for rapid computation of isotope distributions and mean masses of isotope peaks, i.e., for simulation of isotopic spectra, improving on best-known results.
Fast simulation of isotope patterns is vital due to the large search space. Above 1000 Dalton, however, the number of molecules with a certain mass increases rapidly. Since the size of the search space thus becomes prohibitive, generating all potential solutions, simulating their isotope patterns, and matching them against the input is often not feasible. Instead, we define several "additive invariants" extracted from the input and then propose to solve a "joint decomposition problem": Given a finite weighted alphabet with character masses {a_1,...,a_sigma} and a query "m", a "decomposition" of "m" is a non-negative integer vector (c_1,..., c_sigma) such that sum_i c_i a_i = m. Here, we have the problem of finding a "joint" decomposition "c" for a set of queries, where each query has to be decomposed over a different weighted alphabet. We present an efficient algorithm for producing all joint decompositions of the query vector and demonstrate its fitness on real data extracted from a metabolite database
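
The definition of a decomposition and of a joint decomposition given in this abstract can be illustrated with a small sketch. This is a naive enumeration over integer masses, written only to make the definitions concrete; it is not the efficient algorithm the authors describe, and the function names `decompositions` and `joint_decompositions` are hypothetical.

```python
from typing import Iterator, List, Sequence


def decompositions(masses: Sequence[int], m: int) -> Iterator[List[int]]:
    """Yield every non-negative integer vector c with sum(c[i] * masses[i]) == m.

    Assumes all masses are positive integers (naive enumeration, illustration only).
    """
    def rec(i: int, remaining: int, prefix: List[int]) -> Iterator[List[int]]:
        if i == len(masses) - 1:
            # The last character has to absorb the remaining mass exactly.
            if remaining % masses[i] == 0:
                yield prefix + [remaining // masses[i]]
            return
        # Try every feasible count for character i, then recurse on the rest.
        for count in range(remaining // masses[i] + 1):
            yield from rec(i + 1, remaining - count * masses[i], prefix + [count])

    if masses and m >= 0:
        yield from rec(0, m, [])


def joint_decompositions(
    alphabets: Sequence[Sequence[int]], queries: Sequence[int]
) -> Iterator[List[int]]:
    """Yield vectors c that decompose queries[j] over alphabets[j] for every j.

    Enumerates decompositions of the first query and keeps those that also
    satisfy the remaining (alphabet, query) pairs.
    """
    for c in decompositions(alphabets[0], queries[0]):
        if all(
            sum(ci * ai for ci, ai in zip(c, masses)) == q
            for masses, q in zip(alphabets[1:], queries[1:])
        ):
            yield c
```

For example, with character masses (2, 3, 5) and query 10, and a second alphabet (1, 1, 1) with query 3 (so the second constraint fixes the total number of characters), the only joint decomposition is the vector with one of each character.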

    Decomp – from interpreting Mass Spectrometry peaks to solving the Money Changing Problem

    Summary: We introduce DECOMP, a tool that computes the sum formula of all molecules whose mass equals the input mass. This problem arises frequently in biochemistry and mass spectrometry (MS), when we know the molecular mass of a protein, DNA or metabolite fragment but have no other information. A closely related problem is known as the Money Changing Problem (MCP), where all masses are positive integers. Recently, efficient algorithms have been developed for the MCP, which DECOMP applies to real-valued MS data. The excellent performance of this method on proteomic and metabolomic MS data has recently been demonstrated. DECOMP has an easy-to-use graphical interface, which caters for both types of users: those interested in solving MCP instances and those submitting MS data. Availability: DECOMP is freely accessible at http://bibiserv.techfak.uni-bielefeld.de/decomp
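
As a rough illustration of the Money Changing Problem mentioned in this summary, the sketch below counts how many ways a positive integer mass can be written as a non-negative integer combination of given integer masses. It uses the textbook dynamic program, not DECOMP's own algorithms, and the function name `count_decompositions` is hypothetical.

```python
from typing import Sequence


def count_decompositions(masses: Sequence[int], m: int) -> int:
    """Count non-negative integer vectors c with sum(c[i] * masses[i]) == m.

    Textbook dynamic program; assumes positive integer masses and m >= 0.
    """
    # ways[t] = number of decompositions of t using the masses processed so far
    ways = [0] * (m + 1)
    ways[0] = 1  # the empty combination decomposes mass 0
    for a in masses:
        for total in range(a, m + 1):
            ways[total] += ways[total - a]
    return ways[m]
```

A mass is decomposable exactly when the count is positive, so `count_decompositions((2, 3, 5), 1) > 0` is false while the same check for mass 10 is true.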